Solving Headswitching Translation Cases in LFG-DOT

Authors

  • Andy Way
  • Miriam Butt
Abstract

It has been shown that LFG-MT (Kaplan et al., 1989) has difficulties with headswitching data (Sadler et al., 1989, 1990; Sadler & Thompson, 1991). We revisit these arguments in this paper. Despite attempts at solving these problematic constructions using approaches based on linear logic (Van Genabith et al., 1998) and restriction (Kaplan & Wedekind, 1993), we point out further problems which these approaches introduce. We then show how LFG-DOP (Bod & Kaplan, 1998) can be extended to serve as a novel hybrid model for MT, LFG-DOT (Way, 1999, 2001), which promises to improve upon the DOT model of translation (Poutsma, 1998, 2000) as well as LFG-MT. LFG-DOT improves the robustness of LFG-MT through the use of the LFG-DOP Discard operator, which produces generalized fragments by discarding certain f-structure features. LFG-DOT can, therefore, deal with ill-formed or previously unseen input where LFG-MT cannot. Finally, we demonstrate that LFG-DOT can cope with translational phenomena which prove problematic for other LFG-based models of translation.

1 Headswitching in LFG-MT

Kaplan et al. (1989) illustrate their LFG-MT proposal with the well-known headswitching case venir de X / has just X-ed, as in (1):

(1) The baby just fell ↔ Le bébé vient de tomber.

They propose to deal with such problems in two ways. The first is the lexical entry in (2):

(2) just: (↑ PRED) = 'just ⟨ARG⟩'
          (τ↑ PRED) = venir
          (τ↑ XCOMP) = τ(↑ ARG)

That is, the XCOMP function of venir (in (1), de tomber) corresponds to the ARG function of just (in (1), the baby fell), as shown by the respective source and target f-structures in (3) and (4):

(3) [ PRED  'just ⟨[fall]⟩'
      ARG   [ SUBJ  [ PRED 'baby'
                      SPEC the ]
              TENSE PAST
              PRED  'fall ⟨[baby]⟩' ] ]

(4) [ SUBJ  [ PRED 'bébé'
              SPEC le ]
      TENSE PRES
      PRED  'venir ⟨[bébé],[tomber]⟩'
      XCOMP [ DE   +
              SUBJ [ PRED 'bébé' ]
              PRED 'tomber ⟨[bébé]⟩' ] ]

The second approach is one where just is not treated as a head subcategorizing for an ARG, but as a 'normal' adverbial sentential modifier.
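The effect of the lexical entry in (2) can be sketched computationally. The following is our own illustrative Python rendering, not the paper's formalism: f-structures are nested dicts, and a toy τ function performs the headswitch for just, making the translation of its ARG the XCOMP of venir. The lexical mapping table and attribute names are simplifications introduced here for illustration.

```python
# Illustrative sketch (not from the paper): f-structures as nested dicts,
# with a toy tau implementing entry (2) for "just" -> "venir de":
#   tau(XCOMP of venir) = tau(ARG of just)

def transfer(f):
    """Map a source f-structure to a target f-structure (toy tau)."""
    if f.get("PRED") == "just":
        # Headswitching: 'just' becomes the head 'venir', and the
        # translation of its ARG surfaces as the XCOMP of 'venir'.
        arg = transfer(f["ARG"])
        return {
            "PRED": "venir",
            "TENSE": "PRES",
            "SUBJ": arg["SUBJ"],          # SUBJ shared with the XCOMP,
            "XCOMP": {**arg, "DE": "+"},  # mirroring LFG structure sharing
        }
    lex = {"fall": "tomber", "baby": "bébé", "the": "le"}  # toy lexicon
    out = {}
    for attr, val in f.items():
        out[attr] = transfer(val) if isinstance(val, dict) else lex.get(val, val)
    return out

source = {                      # schematic f-structure (3), "the baby just fell"
    "PRED": "just",
    "ARG": {
        "PRED": "fall",
        "TENSE": "PAST",
        "SUBJ": {"PRED": "baby", "SPEC": "the"},
    },
}
target = transfer(source)
print(target["PRED"], target["XCOMP"]["PRED"])  # venir tomber
```

The point of the sketch is only that the source root (just) does not correspond to the target root (venir); the target head is introduced by the adverb's entry while its clausal argument is demoted to XCOMP.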
Instead, headswitching occurs between source and target f-structures via the τ correspondence, as in (5):

(5) S  →  NP          ADVP                      VP
          (↑ SUBJ)=↓  (↑ SADJ)=↓                ↑=↓
                      (τ(↑ SADJ) XCOMP)=τ↑

    just: ADV, (↑ PRED) = just
               (τ↑ PRED) = venir
    fall: V,   (↑ PRED) = fall
               (τ↑ SUBJ) = τ(↑ SUBJ)
               (τ↑ PRED) = tomber

Here the annotation to ADVP states that the τ of the mother f-structure is the XCOMP of the τ of the SADJ slot. This set of equations (along with others of a more trivial nature) produces the f-structure (6):

(6) [ SUBJ  [ PRED 'baby'
              SPEC the ]
      TENSE PAST
      PRED  'fall ⟨[baby]⟩'
      SADJ  { [ PRED 'just' ] } ]

1.1 Embedded Cases of Headswitching

However, Sadler et al. (1989, 1990) show that neither approach is able to deal elegantly and straightforwardly with more complex cases of headswitching, as in (7):

(7) I think that the baby just fell ↔ Je pense que le bébé vient de tomber.

In (7), the headswitching phenomenon takes place in the sentential COMP, rather than in the main clause, as in (1). Here the structure in (3) must be a COMP to a PRED in a higher f-structure. Hence, the normal f-description on embedded S nodes ((↑ COMP) = ↓) must be optional, and instead the structure in (3) must be unified into the root f-structure as the value of its COMP node. This can be handled by the disjunction in (8):

(8) VP  →  V    that    S
           ↑=↓          { (↑ COMP)=↓ | (↑ COMP ARG)=↓ }

We require this disjunction on embedded S nodes to include (↑ COMP ARG)=↓ just in case they contain such a headswitching construction, as f-structure (9) shows:

(9) [ SUBJ [ PRED 'I' ]
      PRED 'think ⟨[I],[just]⟩'
      COMP [ PRED 'just ⟨[fall]⟩'
             ARG  [ SUBJ  [ PRED 'baby'
                            SPEC the ]
                    TENSE PAST
                    PRED  'fall ⟨[baby]⟩' ] ] ]

Otherwise, structure (3) (rooted in just) is not connected to the higher COMP slot. Nevertheless, the solution proposed in (8) seems a little ad hoc, requiring a disjunction just in case the sentential COMP includes a headswitching case. We shall see in the next section that if such headswitching adverbs co-occur, then further disjuncts are required, unless these can be abbreviated by a functional uncertainty equation.
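A functional uncertainty path such as (COMP* ARG) can be resolved mechanically against an f-structure. The following is our own illustrative sketch, not code from the paper: it follows zero or more COMP attributes and then an ARG attribute, yielding every match, which is what would let a single equation replace a growing stack of disjuncts.

```python
# Sketch (illustrative, our own) of resolving the functional uncertainty
# path (COMP* ARG) against an f-structure encoded as nested dicts:
# follow zero or more COMP attributes, then one ARG attribute.

def resolve_comp_star_arg(f):
    """Yield every f-structure value reachable via the path COMP* ARG."""
    if "ARG" in f:                      # zero COMPs, then ARG
        yield f["ARG"]
    if isinstance(f.get("COMP"), dict): # one more COMP, then recurse
        yield from resolve_comp_star_arg(f["COMP"])

fstr = {  # schematic version of f-structure (9)
    "PRED": "think",
    "SUBJ": {"PRED": "I"},
    "COMP": {
        "PRED": "just",
        "ARG": {"PRED": "fall", "SUBJ": {"PRED": "baby"}},
    },
}
matches = list(resolve_comp_star_arg(fstr))
print(matches[0]["PRED"])  # fall
```

On (9) the path matches exactly once, at the fall clause inside the headswitched COMP; with further embedding under think-type verbs, the same path would keep matching without any new disjuncts.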
If we choose the second approach (5), where just is a sentential modifier, given that the headswitching is effected by the τ correspondence, we require the lexical entry for think in (10):

(10) think: V, (τ↑ PRED) = penser
               (τ↑ SUBJ) = τ(↑ SUBJ)
               (τ↑ COMP) = τ(↑ COMP)

This specifies that the τ of the mother f-structure's COMP slot is the COMP of the τ of the mother's f-structure. That is, both this argument, the COMP, and the SUBJ of think are to be translated straightforwardly. This is indeed the case in (11):

(11) I think that the baby fell ↔ Je pense que le bébé est tombé.

However, when the COMP includes a headswitching case, as in (7), we end up with a doubly rooted target f-structure because of a clash between the regular equation in the lexical entry for think, (10), and the structural equation on the ADVP in (5), which requires the τ of the same piece of f-structure to be the XCOMP of the τ of the SADJ slot. One piece of f-structure is required to fill two inconsistent slots. We will now illustrate this in detail. The c- and f-structures for the source sentence in (7), I think that the baby just fell, are shown in (12):
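The clash can be made concrete with a toy construction of our own (not the paper's formalism): model each equation as placing the same embedded-clause dict into a slot, and then count the roots of the resulting graph. The entry for think puts the clause into penser's COMP, while the ADVP annotation puts it into venir's XCOMP, leaving neither penser nor venir contained in the other.

```python
# Toy illustration (our construction) of the "doubly rooted" target
# f-structure arising for sentence (7): the same embedded clause is
# claimed by penser's COMP (entry (10)) and by venir's XCOMP (rule (5)).

tomber = {"PRED": "tomber", "SUBJ": {"PRED": "bébé", "SPEC": "le"}}

# From (10): (tau-up COMP) = tau(up COMP)  ->  clause under penser
penser = {"PRED": "penser", "SUBJ": {"PRED": "je"}, "COMP": tomber}

# From (5): (tau(up SADJ) XCOMP) = tau-up  ->  clause under venir
venir = {"PRED": "venir", "XCOMP": tomber}

def roots(structures):
    """Return the structures not embedded inside any other one."""
    def contains(f, g):
        return any(v is g or (isinstance(v, dict) and contains(v, g))
                   for v in f.values())
    return [f for f in structures
            if not any(contains(g, f) for g in structures if g is not f)]

print(len(roots([penser, venir, tomber])))  # 2: penser and venir
```

A well-formed f-structure has a single root; here the two equations leave penser and venir as independent roots sharing the tomber clause, which is exactly the inconsistency the text describes.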


Similar articles

LFG-DOT: Combining Constraint-Based and Empirical Methodologies for Robust MT

The Data-Oriented Parsing Model (DOP, [1]; [2]) has been presented as a promising paradigm for NLP. It has also been used as a basis for Machine Translation (MT) — Data-Oriented Translation (DOT, [9]). Lexical Functional Grammar (LFG, [5]) has also been used for MT ([6]). LFG has recently been allied to DOP to produce a new LFG-DOP model ([3]) which improves the robustness of LFG. We summarize ...


LFG-DOT: a probabilistic, constraint-based model for machine translation

We develop novel models for Machine Translation (MT) based on Data-Oriented ...


Data-Oriented Models of Parsing and Translation

The merits of combining the positive elements of the rule-based and data-driven approaches to MT are clear: a combined model has the potential to be highly accurate, robust, cost-effective to build and adaptable. While the merits are clear, however, how best to combine these techniques into a model which retains the positive characteristics of each approach, while inheriting as few of the disad...


TIGER TRANSFER Utilizing LFG Parses for Treebank Annotation

Creation of high-quality treebanks requires expert knowledge and is extremely time consuming. Hence applying an already existing grammar in treebanking is an interesting alternative. This approach has been pursued in the syntactic annotation of German newspaper text in the TIGER project. We utilized the large-scale German LFG grammar of the PARGRAM project for semi-automatic creation of TIGER t...


LFG for Chinese: Issues of Representation and Computation

LFG has been widely used to analyze the English language as well as other languages from a linguistic point of view [Joan Bresnan 2001; Louisa Sadler 1996], including Chinese [Lian-Cheng Chief 1996; One-Soon Her. 1997]. A new direction in the LFG research field is applying it to language computation, ranging from parsing to machine translation [Louisa Sadler, Josef van Genabith, and Andy Way 2000; Mark J...




Publication date: 2001